71 research outputs found

    Process Algebra, CCS, and Bisimulation Decidability

    Get PDF
    Over the past fifteen years, there has been intensive study of formal systems that can model concurrency and communication. Two such systems are the Calculus of Communicating Systems, and the Algebra of Communicating Processes. The objective of this paper has two aspects; (1) to study the characteristics and features of these two systems, and (2) to investigate two interesting formal proofs concerning issues of decidability of bisimulation equivalence in these systems. An examination of the processes that generate context-free languages as a trace set shows that their bisimulation equivalence is decidable, in contrast to the undecidability of their trace set equivalence. Recent results have also shown that the bisimulation equivalence problem for processes with a limited amount of concurrency is decidable

    Exceptional Case Marking in the Xtag System

    Get PDF

    A Part-of-Speech Tagger for Yiddish: First Steps in Tagging the Yiddish Book Center Corpus

    Full text link
    We describe the construction and evaluation of a part-of-speech tagger for Yiddish (the first one, to the best of our knowledge). This is the first step in a larger project of automatically assigning part-of-speech tags and syntactic structure to Yiddish text for purposes of linguistic research. We combine two resources for the current work - an 80K word subset of the Penn Parsed Corpus of Historical Yiddish (PPCHY) (Santorini, 2021) and 650 million words of OCR'd Yiddish text from the Yiddish Book Center (YBC). We compute word embeddings on the YBC corpus, and these embeddings are used with a tagger model trained and evaluated on the PPCHY. Yiddish orthography in the YBC corpus has many spelling inconsistencies, and we present some evidence that even simple non-contextualized embeddings are able to capture the relationships among spelling variants without the need to first "standardize" the corpus. We evaluate the tagger performance on a 10-fold cross-validation split, with and without the embeddings, showing that the embeddings improve tagger performance. However, a great deal of work remains to be done, and we conclude by discussing some next steps, including the need for additional annotated training and test data

    Nominalization and Alternations in Biomedical Language

    Get PDF
    Background: This paper presents data on alternations in the argument structure of common domain-specific verbs and their associated verbal nominalizations in the PennBioIE corpus. Alternation is the term in theoretical linguistics for variations in the surface syntactic form of verbs, e.g. the different forms of stimulate in FSH stimulates follicular development and follicular development is stimulated by FSH. The data is used to assess the implications of alternations for biomedical text mining systems and to test the fit of the sublanguage model to biomedical texts. Methodology/Principal Findings: We examined 1,872 tokens of the ten most common domain-specific verbs or their zerorelated nouns in the PennBioIE corpus and labelled them for the presence or absence of three alternations. We then annotated the arguments of 746 tokens of the nominalizations related to these verbs and counted alternations related to the presence or absence of arguments and to the syntactic position of non-absent arguments. We found that alternations are quite common both for verbs and for nominalizations. We also found a previously undescribed alternation involving an adjectival present participle. Conclusions/Significance: We found that even in this semantically restricted domain, alternations are quite common, and alternations involving nominalizations are exceptionally diverse. Nonetheless, the sublanguage model applies to biomedica

    Using Higher-Order Logic Programming for Semantic Interpretation of Coordinate Constructs

    No full text
    Many theories of semantic interpretation use -term manipulation to compositionally compute the meaning of a sentence. These theorie
    • …
    corecore